ABSTRACT


While the novel Middle East Respiratory Syndrome Coronavirus (MERS-CoV) is closely
related to Tylonycteris bat CoV HKU4 (Ty-BatCoV HKU4) and Pipistrellus bat CoV 
HKU5 (Pi-BatCoV HKU5) in bats from Hong Kong, and other potential lineage C 
betacoronaviruses in bats from Africa, Europe and America, its animal origin remains 
obscure. To better understand the role of bats in its origin, we examined the molecular 
epidemiology and evolution of lineage C betacoronaviruses among bats. Ty-BatCoV 
HKU4 and Pi-BatCoV HKU5 were detected in 29% and 25% of alimentary samples from 
lesser bamboo bat (Tylonycteris pachypus) and Japanese pipistrelle (Pipistrellus abramus)
respectively. Sequencing of their RdRp, S and N genes revealed that MERS-CoV is more 
closely related to Pi-BatCoV HKU5 in RdRp (92.1-92.3% aa identities) but to Ty- 
BatCoV HKU4 in S (66.8-67.4% aa identities) and N (71.9-72.3% aa identities). 
Although both viruses were under purifying selection, the S of Pi-BatCoV HKU5 
displayed marked sequence polymorphisms and more positively selected sites than that of 
Ty-BatCoV HKU4, suggesting that Pi-BatCoV HKU5 may generate variants to occupy 
new ecological niches along with its host which faces diverse habitats. Molecular clock 
analysis showed that they diverged from a common ancestor with MERS-CoV at least 
several centuries ago. Although MERS-CoV may have diverged from potential lineage C
betacoronaviruses in European bats more recently, these bat viruses are unlikely to be the
direct ancestor of MERS-CoV. Intensive surveillance for lineage C betaCoVs in
Pipistrellus and related bats with diverse habitats, and other animals from the Middle 
East may fill the evolutionary gap. 
INTRODUCTION


Coronaviruses (CoVs) infect humans and a wide variety of animals, causing respiratory,
enteric, hepatic and neurological diseases of varying severity. They have been classified 
traditionally into groups 1, 2 and 3, based on genotypic and serological characteristics (1, 
2). Recently, the nomenclature and taxonomy of CoVs have been revised by the 
Coronavirus Study Group of the International Committee for Taxonomy of Viruses 
(ICTV). They are now classified into three genera, Alphacoronavirus, Betacoronavirus 
and Gammacoronavirus, replacing the three traditional groups (3). Novel CoVs, which
represented a novel genus, Deltacoronavirus, have also been identified (4, 5). While 
CoVs from all four genera can be found in mammals, bat CoVs are likely the gene source 
of Alphacoronavirus and Betacoronavirus, and avian CoVs are the gene source of 
Gammacoronavirus and Deltacoronavirus (5-7). 

CoVs are well known for their high frequency of recombination and mutation 
rates, which may allow them to adapt to new hosts and ecological niches (1, 8-12). This 
is best exemplified by the severe acute respiratory syndrome (SARS) epidemic, which 
was caused by SARS CoV (13, 14). The virus has been shown to have originated from
animals, with horseshoe bats as the natural reservoir and palm civet as the intermediate 
host allowing animal-to-human transmission (15-18). Since the SARS epidemic, many 
other novel CoVs in both humans and animals have been discovered (4, 7, 19-24). In 
particular, a previously unknown diversity of CoVs has been described in bats from
China and other countries, suggesting that bats are important reservoirs of alphaCoVs and 
betaCoVs (16, 18, 25-32). 
In September 2012, two cases of severe community-acquired pneumonia were
reported in Saudi Arabia, which were subsequently found to be caused by a novel CoV, 
Middle East Respiratory Syndrome Coronavirus (MERS-CoV), previously known as 
human betaCoV 2c EMC/2012 (33, 34, 35). As of May 2013, a total of 40 laboratory 
confirmed cases of MERS-CoV infection have been reported with 20 deaths (36), giving 
a crude fatality rate of 50%. So far, most cases of MERS-CoV infection presented with 
severe acute respiratory illness (36, 37). A macaque model for MERS-CoV infection has 
also been established, which showed that the virus caused localized-to-widespread 
pneumonia in all infected animals (38). The viral virulence may be related to the ability 
of MERS-CoV to evade the innate immunity with attenuated interferon-β response (39-
41). Moreover, the ability to cause human-to-human transmission has raised the 
possibility of another SARS-like epidemic (36, 37). However, the source of this novel 
CoV is still obscure, which has hindered public health and infection control strategies for 
disease prevention. Phylogenetically, MERS-CoV belongs to Betacoronavirus lineage C, 
being closely related to Tylonycteris bat CoV HKU4 (Ty-BatCoV HKU4) and 
Pipistrellus bat CoV HKU5 (Pi-BatCoV HKU5) previously discovered in lesser bamboo 
bat (Tylonycteris pachypus) and Japanese pipistrelle (Pipistrellus abramus) in Hong
Kong, China respectively (31, 32, 42, 43). Moreover, potential viruses with partial gene 
sequences closely related to MERS-CoV have also been detected in bats from Africa, 
Europe and America, although complete genome sequences were not available (44, 45). 
MERS-CoV is able to infect various mammalian cell lines including primate, porcine, bat 
and rabbit cells, which may be explained by the use of the evolutionarily conserved 
dipeptidyl peptidase 4 (DPP4) as its functional receptor (46, 47). These findings suggest that
MERS-CoV may possess broad species tropism and may have emerged from animals.
However, the direct ancestor virus and animal reservoir of MERS-CoV are yet to be
identified.

To better understand the evolutionary origin of MERS-CoV and the possible role 
of bats as the reservoir for its ancestral viruses, studies on the genetic diversity and 
evolution of lineage C betaCoVs in bats would be important. We attempted to study the 
epidemiology of lineage C betaCoVs, including Ty-BatCoV HKU4 and Pi-BatCoV 
HKU5, among various bat species in Hong Kong, China. The complete RNA-dependent 
RNA polymerase (RdRp), spike (S) and nucleocapsid (N) genes of 13 Ty-BatCoV HKU4 
and 15 Pi-BatCoV HKU5 strains were sequenced to assess their genetic diversity and 
evolution. The results revealed that the two viruses were stably evolving in their 
respective hosts, and have diverged from their common ancestor a long time ago. However,
the S protein of Pi-BatCoV HKU5 exhibited marked sequence divergence and much 
more positively selected sites than that of Ty-BatCoV HKU4, which may suggest the 
ability of Pi-BatCoV HKU5 along with its host to occupy new ecological niches. The 
potential implications for the animal origin of MERS-CoV are also discussed.
METHODS


Collection of bat samples. Various bat species were captured from different locations in 
Hong Kong, China over a 7-year period (April 2005 to August 2012). Their respiratory 
and alimentary specimens were collected using procedures described previously (16, 48). 
To prevent cross contamination, specimens were collected using disposable swabs with 
protective gloves changed between samples. All specimens were immediately placed in 
viral transport medium containing Earle's balanced salt solution (Invitrogen, New York, 
United States), 20% glucose, 4.4% NaHCO3, 5% bovine albumin, 50000 µg/ml
vancomycin, 50000 µg/ml amikacin, 10000 units/ml nystatin, before transportation to the
laboratory for RNA extraction. 

RNA extraction. Viral RNA was extracted from the respiratory and alimentary 
specimens using QIAamp Viral RNA Mini Kit (QIAgen, Hilden, Germany). The RNA 
was eluted in 50 µl of AVE buffer (QIAgen) and was used as the template for RT-PCR.

RT-PCR for CoVs and DNA sequencing. CoV detection was performed by 
amplifying a 440-bp fragment of the RdRp gene of CoVs using conserved primers
(5’-GGTTGGGACTATCCTAAGTGTGA-3’ and 5’-CCATCATCAGATAGAATCATCATA-3’)
designed by multiple alignments of the nucleotide sequences of available RdRp genes
of known CoVs as described previously
(17, 24). Reverse transcription was performed using the SuperScript III kit (Invitrogen,
San Diego, CA, USA). The PCR mixture (25 µl) contained cDNA, PCR buffer (10 mM
Tris-HCl pH 8.3, 50 mM KCl, 3 mM MgCl2 and 0.01% gelatin), 200 µM of each dNTP
and 1.0 U Taq polymerase (Applied Biosystems, Foster City, CA, USA). The mixtures
were amplified in 60 cycles of 94°C for 1 min, 48°C for 1 min and 72°C for 1 min and a
final extension at 72°C for 10 min in an automated thermal cycler (Applied Biosystems,
Foster City, CA, USA). Standard precautions were taken to avoid PCR contamination
and no false-positive result was observed in negative controls.
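
As a quick sanity check on the screening primers and cycling program described above, the following sketch computes primer length, GC content and the nominal hold time of the run. It assumes that the spaces printed inside the primer sequences are typesetting artifacts; the function and variable names are illustrative only, and ramp times between temperature steps are ignored.

```python
# Illustrative check of the consensus RdRp primers and the cycling program.
# Sequences are taken from the text with internal spaces removed (an assumption).
FORWARD = "GGTTGGGACTATCCTAAGTGTGA"
REVERSE = "CCATCATCAGATAGAATCATCATA"


def gc_count(seq: str) -> int:
    """Return the number of G and C bases in a primer."""
    return sum(1 for base in seq.upper() if base in "GC")


def gc_percent(seq: str) -> float:
    """Return GC content as a percentage of primer length."""
    return 100.0 * gc_count(seq) / len(seq)


def nominal_run_minutes(cycles: int, step_minutes: list, final_extension: float) -> float:
    """Sum of hold times only; ramping between temperatures is ignored."""
    return cycles * sum(step_minutes) + final_extension


if __name__ == "__main__":
    for name, seq in (("forward", FORWARD), ("reverse", REVERSE)):
        print(f"{name}: {len(seq)} nt, GC = {gc_percent(seq):.1f}%")
    # 60 cycles of 1 min each at 94°C, 48°C and 72°C, then 10 min at 72°C
    print("hold time (min):", nominal_run_minutes(60, [1, 1, 1], 10))
```

The low annealing temperature (48°C) is consistent with primers of modest GC content that must tolerate mismatches across divergent CoVs, although the text does not state this rationale explicitly.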

The PCR products were gel-purified using the QIAquick gel extraction kit 
(QIAgen, Hilden, Germany). Both strands of the PCR products were sequenced twice
with an ABI Prism 3700 DNA Analyzer (Applied Biosystems, Foster City, CA, USA), 
using the two PCR primers. The sequences of the PCR products were compared with 
known sequences of the RdRp genes of CoVs in the GenBank database to identify 
lineage C betaCoVs. 

Sequencing and analysis of the complete RdRp, S and N genes of Ty-BatCoV 
HKU4 and Pi-BatCoV HKU5 strains. To study the genetic diversity and evolution of 
Ty-BatCoV HKU4 and Pi-BatCoV HKU5 detected in bats, the complete RdRp, S and N
genes of 13 Ty-BatCoV HKU4 strains and 15 Pi-BatCoV HKU5 strains detected at
different times and/or places, in addition to the nine previous strains with complete genome
sequences, were amplified and sequenced using primers designed according to available 
genome sequences (Table 1) (32). The sequences of the PCR products were assembled 
manually to produce the complete RdRp, S and N gene sequences. Multiple sequence 
alignments were constructed using MUSCLE in MEGA version 5 (49, 50). Phylogenetic 
trees were constructed using the maximum-likelihood method (51), with bootstrap values
calculated from 100 trees. Protein family analysis was performed using PFAM and
InterProScan (52, 53). Prediction of transmembrane domains was performed using
TMHMM (54). The heptad repeat (HR) regions were predicted using the coiled-coil
prediction program MultiCoil2 (55).
Estimation of synonymous and non-synonymous substitution rates. The
number of synonymous substitutions per synonymous site, Ks, and the number of non-
synonymous substitutions per non-synonymous site, Ka, for each coding region were 
calculated using the Nei-Gojobori method (Jukes-Cantor) in MEGA version 5 (50). 
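
The Jukes-Cantor correction applied to the Nei-Gojobori proportions can be sketched as follows. This is a minimal illustration and not the MEGA implementation: it assumes that the numbers of synonymous and non-synonymous sites and differences have already been counted, and the counts in the usage example are hypothetical.

```python
import math


def jukes_cantor(p: float) -> float:
    """Jukes-Cantor corrected distance d = -(3/4) ln(1 - 4p/3)."""
    if not 0.0 <= p < 0.75:
        raise ValueError("proportion of differences must lie in [0, 0.75)")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def ka_ks(nonsyn_diffs: float, nonsyn_sites: float,
          syn_diffs: float, syn_sites: float):
    """Return (Ka, Ks, Ka/Ks) from pre-computed site and difference counts."""
    ka = jukes_cantor(nonsyn_diffs / nonsyn_sites)
    ks = jukes_cantor(syn_diffs / syn_sites)
    if ks == 0.0:
        raise ValueError("Ks is zero; the Ka/Ks ratio is undefined")
    return ka, ks, ka / ks


if __name__ == "__main__":
    # Hypothetical counts for one pairwise comparison (not data from this study).
    ka, ks, ratio = ka_ks(nonsyn_diffs=12, nonsyn_sites=3000,
                          syn_diffs=90, syn_sites=1000)
    print(f"Ka = {ka:.4f}, Ks = {ks:.4f}, Ka/Ks = {ratio:.3f}")
```

A Ka/Ks ratio well below 1 is conventionally read as purifying selection, which is how the ratios are interpreted in the Results.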

Detection of positive selection. Sites under positive selection in the S gene in Ty- 
BatCoV-HKU4 and Pi-BatCoV-HKU5 were inferred using single-likelihood ancestor 
counting (SLAC), fixed effects likelihood (FEL) and random effects likelihood (REL) 
methods as implemented in the DataMonkey server (http://www.datamonkey.org) (56).
Positive selection for a site was considered to be statistically significant if the P-value
was <0.1 for the SLAC and FEL methods or the posterior probability was >90% for the REL
method. A mixed-effects model of evolution (MEME) was further used to identify
positively selected sites under episodic diversifying selection in particular positions in 
sublineages within a phylogenetic tree even when positive selection is not evident across 
the entire tree (57). Positively selected sites with a P-value <0.05 were reported. 
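
The significance criteria stated above can be expressed as a small filter. The sketch below is illustrative only: the cut-offs come from the text, whereas the function name and the example site records are hypothetical and do not reproduce the DataMonkey output format.

```python
# Cut-offs as stated in the text; everything else here is an assumption.
P_VALUE_CUTOFF = {"SLAC": 0.1, "FEL": 0.1, "MEME": 0.05}
REL_POSTERIOR_CUTOFF = 0.90


def is_positively_selected(method: str, value: float) -> bool:
    """Apply the per-method criterion.

    For SLAC, FEL and MEME, `value` is a P-value (smaller is stronger).
    For REL, `value` is a posterior probability (larger is stronger).
    """
    method = method.upper()
    if method == "REL":
        return value > REL_POSTERIOR_CUTOFF
    if method in P_VALUE_CUTOFF:
        return value < P_VALUE_CUTOFF[method]
    raise ValueError(f"unknown method: {method}")


if __name__ == "__main__":
    # Hypothetical (site, method, statistic) records.
    records = [(101, "SLAC", 0.08), (101, "MEME", 0.07), (245, "REL", 0.95)]
    for site, method, value in records:
        verdict = is_positively_selected(method, value)
        print(f"site {site} {method}: {'selected' if verdict else 'not significant'}")
```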

Estimation of divergence time. As RdRp and N genes are relatively conserved 
across CoVs and therefore most likely reflect viral phylogeny, divergence time was 
calculated using complete RdRp and N gene sequence data of Ty-BatCoV HKU4, Pi-
BatCoV HKU5 and MERS-CoV strains, and 904-bp partial RdRp sequence data of
lineage C betaCoVs from European bats, with Bayesian Markov Chain Monte Carlo 
(MCMC) approach as implemented in BEAST (Version 1.7.4) as described previously (9, 
17, 21, 44, 58, 59). One parametric model (Constant Size) and one non-parametric model 
(Bayesian Skyline with five groups) tree priors were used for the inference. Analyses 
were performed under the Hasegawa-Kishino-Yano (HKY) model with coding sequence
partitioned into 1st + 2nd versus 3rd positions and rate variation between sites described
by a four-category discrete gamma distribution using both strict and relaxed [uncorrelated
lognormal (Ucld) and uncorrelated exponential (Uced)] molecular clocks. The MCMC run
was 2 × 10⁸ steps long, sampling every 1,000 steps. Convergence was assessed on the
basis of the effective sample size after a 10% burn-in using Tracer software Version 1.5
(58). The mean time of the most recent common ancestor (tMRCA) and the highest 
posterior density regions at 95% (HPD) were calculated, and the best-fitting model was 
selected by a Bayes factor, using marginal likelihoods implemented in Tracer (60). 
Bayesian Skyline under a relaxed clock model with Uced was adopted for making 
inferences, as this model fitted the data better than other models tested by Bayes factor 
analysis (data not shown) and allowed variations in substitution rates among lineages. All 
trees were summarized in a target tree by the TreeAnnotator program included in the
BEAST package by choosing the tree with the maximum sum of posterior probabilities 
(maximum clade credibility) after a 10% burn-in. 
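
The chain settings above imply the following sample bookkeeping. This is only an arithmetic sketch with illustrative names; it assumes the burn-in is taken as a fraction of logged samples and ignores whether the initial state is also logged, which could shift the totals by one.

```python
CHAIN_LENGTH = 2 * 10**8   # MCMC steps, as stated in the text
SAMPLE_EVERY = 1_000       # logging interval in steps
BURN_IN_PERCENT = 10       # percentage of samples discarded as burn-in


def sample_bookkeeping(chain_length: int, sample_every: int, burn_in_percent: int):
    """Return (logged, discarded, retained) sample counts using integer arithmetic."""
    logged = chain_length // sample_every
    discarded = logged * burn_in_percent // 100
    return logged, discarded, logged - discarded


if __name__ == "__main__":
    logged, discarded, retained = sample_bookkeeping(
        CHAIN_LENGTH, SAMPLE_EVERY, BURN_IN_PERCENT
    )
    print(f"logged={logged}, burn-in={discarded}, retained={retained}")
```

The retained samples are the ones from which the tMRCA means and 95% HPD intervals are summarized.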

Nucleotide sequence accession numbers. The nucleotide sequences of the 
complete RdRp, S and N genes of Ty-BatCoV HKU4 and Pi-BatCoV HKU5 have been 
lodged within the GenBank sequence database under accession no. KC522036 to 
KC522119. 
RESULTS


Detection of Ty-BatCoV HKU4 and Pi-BatCoV HKU5 from bat samples. A total of 
5426 respiratory and 5260 alimentary specimens from 5481 bats of 21 different species 
were obtained. RT-PCR for a 440-bp fragment in the RdRp genes of CoVs detected the 
presence of lineage C betaCoVs from two bat species, including Ty-BatCoV HKU4 in 29 
(29%) of 99 alimentary samples from lesser bamboo bat (Tylonycteris pachypus) and Pi- 
BatCoV HKU5 in 55 (25%) of 216 alimentary samples from Japanese pipistrelle 
(Pipistrellus abramus) respectively (Table 2). None of the respiratory samples were
positive for lineage C betaCoVs. Bats positive for Ty-BatCoV HKU4 and Pi-BatCoV 
HKU5 were from seven and 13 sampling locations in Hong Kong respectively. No 
obvious disease was observed in bats positive for Ty-BatCoV HKU4 and Pi-BatCoV 
HKU5. Ty-BatCoV HKU4 was found only in adult bats while Pi-BatCoV HKU5 was 
found in both adult and juvenile bats. 
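
The detection rates reported above follow directly from the counts. The sketch below recomputes them and, as an addition that is not part of the original analysis, attaches a 95% Wilson score interval to indicate sampling uncertainty; the function name is illustrative.

```python
import math


def wilson_interval(positives: int, total: int, z: float = 1.96):
    """Wilson score interval for a binomial proportion (z = 1.96 for ~95%)."""
    p = positives / total
    denom = 1.0 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return centre - half, centre + half


if __name__ == "__main__":
    # Counts of positive alimentary samples taken from the text.
    counts = {
        "Ty-BatCoV HKU4 (lesser bamboo bat)": (29, 99),
        "Pi-BatCoV HKU5 (Japanese pipistrelle)": (55, 216),
    }
    for label, (positives, total) in counts.items():
        low, high = wilson_interval(positives, total)
        rate = 100.0 * positives / total
        print(f"{label}: {rate:.1f}% ({100 * low:.1f}-{100 * high:.1f}%)")
```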

Complete RdRp, S and N gene analysis of Ty-BatCoV HKU4 and Pi-BatCoV 
HKU5 strains. To study the genetic diversity and evolution of lineage C betaCoVs in 
bats, the complete RdRp, S and N gene sequences of 13 Ty-BatCoV HKU4 strains and 15 
Pi-BatCoV HKU5 strains were sequenced. Comparison of the deduced aa sequences of 
the RdRp, S and N genes of Ty-BatCoV HKU4 and Pi-BatCoV HKU5 to those of 
MERS-CoV showed that MERS-CoV is more closely related to Pi-BatCoV HKU5 than 
to Ty-BatCoV HKU4 (92.1-92.3% versus 89.6-90% identities) in the RdRp gene, but 
more closely related to Ty-BatCoV HKU4 than to Pi-BatCoV HKU5 in the S (66.8- 
67.4% versus 63.4-64.5% identities) and N (71.9-72.3% versus 69.5-70.5% identities) 
genes (Table 3). Moreover, MERS-CoV is more closely related to Ty-BatCoV HKU4 and 
Pi-BatCoV HKU5 belonging to Betacoronavirus lineage C than to CoVs belonging to
Betacoronavirus lineages A, B and D (Table 3). Phylogenetic analysis of the complete 
RdRp, S and N gene sequences of Ty-BatCoV HKU4 and Pi-BatCoV HKU5 showed that 
the sequences from the 13 Ty-BatCoV HKU4 strains and 15 Pi-BatCoV HKU5 strains 
formed two distinct clusters in all three genes, being closely related to each other and to 
MERS-CoV (Fig. 1). Interestingly, unlike the S genes of the 13 Ty-BatCoV HKU4 
strains which shared highly similar sequences with very short branch lengths, the S genes 
of Pi-BatCoV HKU5 displayed marked sequence polymorphisms among the 15 strains, 
with up to 14% nucleotide and 12% amino acid (aa) differences. 
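
Pairwise identities such as those compared above are computed from aligned sequences. The following is a minimal sketch of one common convention, in which columns containing a gap in either sequence are skipped; alignment tools differ in how they treat gaps, so this convention is an assumption, and the short sequences in the example are hypothetical.

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percent identity between two aligned sequences of equal length.

    Columns with a gap in either sequence are excluded from the comparison.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue  # skip columns containing a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Hypothetical aligned amino acid fragments, not sequences from this study.
    print(percent_identity("MKTLLVA-G", "MKSLLVAQG"))
```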

The S proteins of Ty-BatCoV HKU4 and Pi-BatCoV HKU5 encoded 1350-1352 
and 1352-1359 aa respectively. A potential cleavage site, though not perfectly conserved, 
could be present in the S proteins of Ty-BatCoV HKU4 (S[TM]FR) and Pi-BatCoV 
HKU5 (R[VFL][ALR]R). InterProScan analysis predicted them as type I membrane 
glycoproteins, with most of the protein (residues 18/21/22 to 1294/1296/1297 for Ty- 
BatCoV HKU4 and residues 22 to 1296/1297/1298/1301/1302/1303 for Pi-BatCoV 
HKU5) exposed on the outside of the virus, a transmembrane domain (residues 
1295/1297/1298 to 1317/1319/1320 for Ty-BatCoV HKU4 and residues 
1297/1298/1299/1302/1303/1304 to 1319/1320/1321/1324/1325/1326 for Pi-BatCoV 
HKU5) at the C terminus, followed by a cytoplasmic tail rich in cysteine residues. Two
heptad repeats (HR), important for membrane fusion and viral entry (61), were located at 
residues 978/980 to 1124/1126 (HR1) and 1251/1253 to 1285/1287 (HR2) for Ty- 
BatCoV HKU4, and residues 978/979/983/984 to 1124/1125/1129/1130 (HR1) and 
1253/1254/1258/1259 to 1287/1288/1292/1293 (HR2) for Pi-BatCoV HKU5. All 
cysteine residues are conserved between the S of Ty-BatCoV HKU4, Pi-BatCoV HKU5
and MERS-CoV. While CoVs are known to utilize a variety of host receptors for cell 
entry, a number of closely related as well as distantly related CoVs may utilize the same 
receptor. For example, aminopeptidase N (CD13) has been shown to be the receptor for 
various alphaCoVs including HCoV 229E, canine CoV (CCoV), feline infectious 
peritonitis virus (FIPV), porcine epidemic diarrhea coronavirus (PEDV) and 
transmissible gastroenteritis coronavirus (TGEV) (62, 63). Moreover, human
angiotensin-converting enzyme 2 (hACE2) has been found to be the receptor for both
HCoV NL63, an alphaCoV, and SARS CoV, a betaCoV, although they utilize different
receptor-binding sites (64, 65). As for lineage A betaCoVs, HCoV OC43 and the closely
related bovine CoV utilize N-acetyl-9-O-acetylneuraminic acid as receptor, whereas
carcinoembryonic antigen-related cell adhesion molecule 1 (CEACAM1) is the receptor
for mouse hepatitis virus (MHV) (66-70). The S proteins of Ty-BatCoV HKU4 and Pi-
BatCoV HKU5 as well as MERS-CoV did not exhibit significant sequence homology to
the known RBDs of other CoVs including betaCoVs such as SARS CoV and HCoV
OC43 (71-78). Recently, DPP4 has been identified as a functional receptor for MERS-
CoV, although the exact receptor-binding domain is still unknown (47, 79). Based on the
X-ray crystal structure of the RBD in the SARS CoV S protein, residues 377 to
662 have been predicted as a possible RBD for MERS-CoV (80). Using the same
methodology, residues 387 to 587 in the Ty-BatCoV HKU4 S protein and residues 389 to
580 in the Pi-BatCoV HKU5 S protein were predicted as their possible RBDs. However,
further studies are required to elucidate the receptors for Ty-BatCoV HKU4 and Pi-BatCoV
HKU5 and their RBDs.
Estimation of synonymous and non-synonymous substitution rates. In line
with phylogenetic analysis, multiple alignment of the S gene sequences showed that Pi- 
BatCoV HKU5 possessed more synonymous and non-synonymous substitutions than Ty- 
BatCoV HKU4 (Table 4). Compared to Ty-BatCoV HKU4 in which 58 aa positions 
contained substitutions, 253 aa positions in Pi-BatCoV HKU5 contained substitutions 
among which >2 aa were encoded at 67 aa positions (Fig. 2 and 3). The Ka/Ks ratios for 
the RdRp, S and N genes among different strains of Ty-BatCoV HKU4 and Pi-BatCoV 
HKU5 were determined (Table 4). The Ka/Ks ratios were generally low, although the S 
genes of both viruses showed relatively higher ratios (0.118) compared to RdRp and N 
genes. This suggested that these genes were under purifying selection. Nevertheless, the 
Ka and Ks of the S genes of Pi-BatCoV HKU5 were relatively high compared to those of 
Ty-BatCoV HKU4, which reflected the marked sequence polymorphisms among 
different strains. 

Detection of positive selection in S genes. The S genes of Pi-BatCoV HKU5 
possessed more positively selected sites than the S genes of Ty-BatCoV HKU4 (Fig. 4). 
Only two and five aa positions in Ty-BatCoV HKU4 were found to be under positive 
selection using REL and MEME methods respectively, whereas no significant positive 
selection was identified by SLAC and FEL methods. In contrast, two, 12, 27 and 43 aa 
positions in Pi-BatCoV HKU5 were found to be under positive selection using SLAC, 
FEL, REL and MEME methods respectively. Most of these sites were distributed within
the S1 domain, indicating that this domain may have been under functional constraints.

Estimation of divergence time. To estimate the divergence time of Ty-BatCoV 
HKU4, Pi-BatCoV HKU5 and MERS-CoV strains, their complete RdRp and N gene 
sequences were subjected to molecular clock analysis using the relaxed clock model with
Uced. Using complete RdRp gene sequences, tMRCA of MERS-CoV and Pi-BatCoV
HKU5 was estimated at 1520.09 (HPDs, 745.73 to 1956.12) (Fig. 5A). Using complete N
gene sequences, tMRCA of MERS-CoV, Ty-BatCoV HKU4 and Pi-BatCoV HKU5 was
estimated at 1323.51 (HPDs, 383.58 to 1897.75) (Fig. 5B). Since partial RdRp gene
sequences closely related to the corresponding sequence of MERS-CoV have recently
been detected in European bats, molecular clock analysis was also performed to estimate
their divergence time. Using the 904-bp partial RdRp sequences, tMRCA of MERS-CoV
and three European bat CoV strains (BtCoV 8-691, BtCoV 8-724 and BtCoV UKR-G17)
was estimated at 1859.32 (HPDs, 1636.67 to 1987.55) (Fig. 5C). The estimated mean
substitution rates of the complete RdRp and N gene, and partial RdRp sequence data sets
were 5.12 × 10⁻⁴, 8.642 × 10⁻⁴ and 7.407 × 10⁻⁴ substitutions per site per year
respectively, comparable to those observed in other CoVs (9, 17, 59, 81, 82).
DISCUSSION


In this study, Ty-BatCoV HKU4 and Pi-BatCoV HKU5 were found to be highly 
prevalent among lesser bamboo bat and Japanese pipistrelle in Hong Kong respectively, 
with detection rates of 25-29% in their alimentary samples. In line with previous studies, 
MERS-CoV is more closely related to Betacoronavirus lineage C than to lineages A, B and D
in the RdRp, S and N genes (34, 42, 43). Nevertheless, the genetic distance between 
MERS-CoV and the various strains of Ty-BatCoV HKU4 and Pi-BatCoV HKU5 was still 
large, with their S proteins having <67.4% aa identities. Two recent studies have 
identified partial gene sequences closely related to MERS-CoV in bats from Africa, 
Europe and America, suggesting that lineage C betaCoVs are distributed in bats 
worldwide (44, 45). In one study, CoVs related to MERS-CoV were detected in 46
(24.9%) Nycteris bats and 40 (14.7%) Pipistrellus bats from Ghana and Europe using RT- 
PCR targeting a 398-bp fragment of the RdRp gene (44). The extended 904-bp RdRp 
sequences of three strains from Romania and Ukraine showed that they shared 87.7- 
88.1% nucleotide and 98.3% amino acid identities to MERS-CoV, compared to 80.3- 
82%/82.4-83.7% nucleotide and 92-92.4%/94-94.4% amino acid identities between Ty- 
BatCoV HKU4/Pi-BatCoV HKU5 and MERS-CoV respectively in the corresponding 
regions. In another study, screening of 606 bats from Mexico showed the presence of a 
betaCoV also closely related to MERS-CoV in a Nyctinomops laticaudatus bat (45).
Although the authors claimed the use of a 329-bp fragment of the RdRp gene for RT-
PCR and sequence analysis, the available sequence was in fact within nsp14. Analysis of
this partial nsp14 sequence showed that it shared 85.7% nucleotide and 95.5% amino acid
identities to MERS-CoV (45), compared to 81.9%/83.4-84.2% nucleotide and
88.6%/92% amino acid identities between Ty-BatCoV HKU4/Pi-BatCoV
HKU5 and MERS-CoV respectively in the corresponding regions. However, complete
gene sequences were not available from these bat CoVs to allow more detailed
phylogenetic analysis. Molecular clock analysis of the complete RdRp gene dated the 
tMRCA of MERS-CoV and Pi-BatCoV HKU5 at around 1520, whereas analysis of the N 
gene dated the tMRCA of MERS-CoV, Ty-BatCoV HKU4 and Pi-BatCoV HKU5 at 
around 1324. Using the 904-bp RdRp sequences available from the three European 
strains, the tMRCA of MERS-CoV and European bat CoV strains was dated at around
1859. Our results suggested that Ty-BatCoV HKU4, Pi-BatCoV HKU5 and MERS-CoV 
have diverged at least centuries ago from their common ancestor. Although MERS-CoV 
and the European bat CoV strains were estimated to have diverged more recently, this is 
unlike the situation in SARS-related CoVs which only diverged between civet and bat 
strains several years before the SARS epidemic (17). Therefore, these bat lineage C 
betaCoVs are unlikely to be the direct ancestor of MERS-CoV. However, the present analysis
is limited by the lack of more sequences from potential intermediate virus species/strains 
with widely distributed and well-determined dates, which better reflect the different 
selective pressures over the long period of time as these viruses evolved. Further studies 
on bats and other animals are required to fill the gap between these bat lineage C
betaCoVs and MERS-CoV during their evolution. Moreover, longer gene or complete 
genome sequence data from these animal viruses would be important for more accurate 
taxonomic and evolutionary studies. 

The divergent sequences of the S genes of Pi-BatCoV HKU5 may suggest that the 
virus has a better ability to generate variants to occupy new ecological niches. The S 
proteins of CoVs are responsible for receptor binding and host adaptation, and are
therefore one of the most variable regions within CoV genomes (16, 18, 28). Studies on
SARS CoV have shown that changes in its S protein, both within and outside of the
receptor-binding domain, could govern CoV cross-species transmission and emergence in new
host populations (83, 84). We have also previously demonstrated recent interspecies 
transmission of an alphaCoV, BatCoV HKU10, from Leschenault’s rousettes to Pomona 
leaf-nosed bats, and the virus has been rapidly adapting in the new host by changing its S 
protein (59). In this study, Ty-BatCoV HKU4 and Pi-BatCoV HKU5 were exclusively 
detected in lesser bamboo bat (Tylonycteris pachypus) and Japanese pipistrelle
(Pipistrellus abramus) respectively. Moreover, the Ka/Ks ratios of the RdRp, S and N
genes in both viruses were low, supporting that the two bat species were the respective 
primary reservoirs for the two CoVs. Nevertheless, unlike that of Ty-BatCoV HKU4, the 
S gene of Pi-BatCoV HKU5 exhibited much higher sequence divergence among different 
strains due to both synonymous and non-synonymous substitutions. Moreover, a much 
higher number of positively selected sites were observed in the S gene of Pi-BatCoV 
HKU5 than that of Ty-BatCoV HKU4, with most of the sites under selection being 
distributed within the S1 region which likely contains the RBD. This suggested that the
S1 region of Pi-BatCoV HKU5 may have been under functional constraints in its host
species, Japanese pipistrelle, which may have favored adaptation to new
hosts/environments.

The marked polymorphisms in the S protein of Pi-BatCoV HKU5 may reflect the 
biological characteristics of its host species, Japanese pipistrelle, which is a small-sized,
insectivorous bat with body weight 4 to 10 g. It is considered the most common bat 
species found in urban areas of Hong Kong (85). While it is abundant in wetland areas,
its roosts are frequently found in towns and villages, as well as various types of buildings 
and other man-made structures, such as fans or air-conditioners. It is also known to utilize 
bat houses or boxes as its roosts. Such diverse habitat and adaptability to harsh 
environments may have favored the mutation of Pi-BatCoV HKU5 especially in its S 
protein which is responsible for receptor binding and immunogenicity. Interestingly, this 
bat species is not only widely distributed in China, Russia, Korea, Japan, Vietnam, 
Burma and India, but also the Kingdom of Saudi Arabia and neighboring countries (42, 
85). Moreover, other Pipistrellus bats including P. arabicus, P. ariel, P. kuhlii, P. 
pipistrellus, P. rueppellii and P. savii have been recorded in the Arabian Peninsula 
(www.iucn.org). In fact, the partial sequences closely related to MERS-CoV detected in
bats from Europe also originated from Pipistrellus bats (P. pipistrellus, P. nathusii
and P. pygmaeus) of the family Vespertilionidae, and those from Ghana originated
from Nycteris bats (Nycteris cf. gambiensis) of the related family Nycteridae (44).
Similarly, the bat betaCoV strain related to MERS-CoV detected in Mexico
originated from an N. laticaudatus bat belonging to Molossidae, a family closely related to
Vespertilionidae (45, 86). The difference between this bat betaCoV and MERS-CoV
within the partial nsp14 sequence was also found to be mainly due to substitutions in the
3rd nucleotide positions, suggesting strong purifying selection (45). However, S gene
sequences were not available from these bat viruses for further analysis of 
polymorphisms and selective pressures. Nevertheless, based on our existing data, bats 
belonging to Vespertilionidae and related families, especially Pipistrellus bats and those 
with diverse habitats, in the Arabian Peninsula should be intensively sought for potential 
ancestral viruses of MERS-CoV, which may have evolved through mutations in the S
gene especially in the RBD, allowing efficient transmission to other animals or humans. In
contrast, lesser bamboo bats, the host species for Ty-BatCoV HKU4 and one of the 
smallest mammals in the world with body weight 3 to 7 g, have much more restricted 
habitats. Though this species also belongs to the family Vespertilionidae, it is remarkably 
adapted to roost inside bamboo stems, and is mainly found in rural areas in Hong Kong 
and various Asian countries (85). This may, in turn, reflect the lower mutation rate 
observed in the S gene of Ty-BatCoV HKU4. 

It remains to be determined if Ty-BatCoV HKU4 and Pi-BatCoV HKU5, as well 
as other lineage C betaCoVs in bats, utilize the same receptor as MERS-CoV. Recent 
studies have shown that MERS-CoV utilizes DPP4 as its functional receptor (47, 79). 
This suggested that these betaCoVs belonging to lineage C may utilize receptor(s) 
different from those of other CoVs. Moreover, expression of bat (P. pipistrellus) DPP4 in
non-susceptible cells was found to enable infection by MERS-CoV (47), which is in line 
with the ability of the virus to replicate in cell lines from Rousettus, Rhinolophus, 
Pipistrellus, Myotis, and Carollia bats (79). As DPP4 is an evolutionarily conserved
protein (47), it may also explain the broad species tropism observed in primate, porcine, 
and rabbit cell lines and reflect the zoonotic origin of MERS-CoV (46, 79). However, Ty- 
BatCoV HKU4 and Pi-BatCoV HKU5, as with other bat CoVs, have not been 
successfully cultured in vitro, which hampers studies on their receptor binding and host 
adaptation. Further discoveries of lineage C betaCoVs in animals and studies on the 
receptors of the different animal counterparts in their respective hosts may help 
understand the mechanism of interspecies transmission and emergence of MERS-CoV. 

Bats are increasingly recognized as reservoirs for various zoonotic viruses 
including SARS CoV, lyssavirus, rabies virus, Hendra, Nipah, Ebola as well as influenza 
virus (87, 88). While the existence of CoVs in bats was unknown before the SARS 
epidemic, it is now known that the different bat populations harbor diverse CoVs, which 
is likely the result of their species diversity, roosting behavior and migrating ability (16, 
18, 29, 31, 32, 89). These warm-blooded flying vertebrates are also ideal hosts to fuel 
CoV recombination and dissemination (5, 27, 59). It remains to be ascertained if bats 
could also be the animal origin for the emergence of MERS-CoV either directly or via an 
intermediate host, the latter as in the case of SARS CoV where the bat ancestral virus 
may have jumped to the intermediate host when bats were in contact or mixed with other 
animals (16). Since a history of contact with animals such as camels and goats has been 
reported in MERS-CoV-infected cases (90), the virus may have jumped from bats to 
these animals before infecting humans. Surveillance studies of lineage C betaCoVs from 
bats and other animals in the Middle East may help identify the origin and chain of 
transmission of MERS-CoV. 

LEGENDS TO FIGURES 


FIG 1 Phylogenetic analysis of RdRp, S and N genes of Ty-BatCoV HKU4 and 
Pi-BatCoV HKU5 strains, and those of other betaCoVs with available complete genome 
sequences. The trees were constructed by the maximum-likelihood method with bootstrap 
values calculated from 100 trees. A total of 937, 1535, and 546 aa positions in RdRp, S, 
and N genes, respectively, were included in the analysis. The scale bar indicates the 
estimated number of substitutions per 5 or 20 aa. HCoV-HKU1, human coronavirus HKU1; 
HCoV-OC43, human coronavirus OC43; MHV, murine hepatitis virus; BCoV, bovine 
coronavirus; PHEV, porcine hemagglutinating encephalomyelitis virus; GiCoV, giraffe 
coronavirus; RCoV, rat coronavirus; ECoV, equine coronavirus; RbCoV HKU14, rabbit 
coronavirus HKU14; AntelopeCoV, sable antelope coronavirus; SARS-CoV, SARS 
coronavirus; SARSr-Rh-BatCoV HKU3, SARS-related Rhinolophus bat coronavirus 
HKU3; SARSr-CiCoV, SARS-related civet coronavirus; SARSr CoV CFB, SARS-related 
Chinese ferret badger coronavirus; Ty-BatCoV HKU4, Tylonycteris bat coronavirus 
HKU4; Pi-BatCoV HKU5, Pipistrellus bat coronavirus HKU5; MERS-CoV EMC, 
Middle East Respiratory Syndrome Coronavirus EMC; MERS-CoV England1, Middle 
East Respiratory Syndrome Coronavirus England1; Ro-BatCoV HKU9, Rousettus bat 
coronavirus HKU9. 

FIG 2 Distribution of amino acid changes in the spike protein of Ty-BatCoV HKU4 
(upper panel) and Pi-BatCoV HKU5 (lower panel). The positions of the amino acid 
changes are depicted by vertical lines. SS, predicted signal peptide; RBD, receptor 
binding domain; HR1, heptad repeat 1; HR2, heptad repeat 2; TM, transmembrane 
domain. 

FIG 3 Graphical representation of multiple sequence alignment showing the amino acid 
changes in the spike protein of Pi-BatCoV HKU5. The height of symbols indicates the 
relative frequency of each amino acid at the position. Polar amino acids are indicated in 
green; neutral amino acids are indicated in purple; basic amino acids are indicated in blue; 
acidic amino acids are indicated in red; hydrophobic amino acids are indicated in black. 
The figure was generated using WebLogo (91). 

FIG 4 Distribution of positively selected sites in S proteins identified using REL in 
Ty-BatCoV HKU4 (upper panel) and Pi-BatCoV HKU5 (lower panel). Positively selected 
sites with posterior probability greater than 0.5 are shown. 

FIG 5 Estimation of the tMRCA of Ty-BatCoV HKU4 and Pi-BatCoV HKU5. The 
time-scaled phylogeny was summarized from all MCMC phylogenies of the (A) complete 
RdRp, (B) complete N and (C) 904-bp RdRp sequence data set analyzed under the 
relaxed clock model with an exponential distribution (Uced) in BEAST v1.7.4. Viruses 
characterized in this study are bolded. 


TABLE 1 Primers used in this study 

Coronavirus and gene 
      Forward primer                              Backward primer 

Ty-BatCoV HKU4 
RdRp  LPW3283   5'-GTAATGTCTGTCAGTATTGGGTT-3'     LPW3232   5'-AACTAATATGCTCTTTAACACTTCAC-3' 
      LPW2771   5'-TGYTAYGCTTTAMGNCAYTTYGA-3'     LPW2773   5'-GTTGGGTAATAACAAAATCACCAA-3' 
      LPW2626   5'-GTTTTAACACTYGATAAYGARGA-3'     LPW2630   5'-AGTATATTGAARTTNGCACARTG-3' 
      LPW2738   5'-CCACCCTAATTGTGTTAATTGTA-3'     LPW2775   5'-TAACTGAAGACCCTTCCTTGAAA-3' 
      LPW3233   5'-GGCAATTTTAATAAAGATTTTTATGA-3'  LPW3234   5'-GCCAAAATCAATGACGCTAAAAT-3' 
      LPW1507   5'-GGTTGGGACTATCCTAAGTGTGA-3'     LPW1508   5'-CCATCATCAGATAGAATCATCATA-3' 
      LPW1037   5'-WTATKTKAARCCWGGTGG-3'          LPW1040   5'-KYDBWRTTRTARCAMACAAC-3' 
      LPW3235   5'-CTTAATAAACACTTTTCTATGATGAT-3'  LPW2678   5'-TACTCACCGAGCTGTACTTTACTA-3' 

S     LPW3797   5'-AGATTTATATAAAATTATGGGAA-3'     LPW4102   5'-TACGTGGTTTTAATATGCAATAAAA-3' 
      LPW3899   5'-TCTCTTACTAATACATCGGCT-3'       LPW3900   5'-AAGACCTGACCATCTTCAGAAA-3' 
      LPW4103   5'-TGGTGCAAACCAAGATGTTGAAA-3'     LPW3712   5'-CTAGCGCTATAACTTCTAAAAGTA-3' 
      LPW3720   5'-CATTAGTAGTTAGTGATTGTAAA-3'     LPW2821   5'-GTCATAAAGTGGTGGTAAAACTT-3' 
      LPW2319   5'-ATTAATGCTAGAGAYCTHMTTTG-3'     LPW2320   5'-TTTGGGTAACTCCAATNCCRTT-3' 
      LPW2824   5'-TTTGCCGCTATACCTTTTGCACAA-3'    LPW4106   5'-TGAGTTATAGGTTCAGGTTTATAA-3' 
      LPW4105   5'-T.ATTAGTGACATCCTTGCTAGGCTT-3'  LPW2317   5'-GAGCCAAACATACCANGGCCAYTT-3' 
      LPW4107   5'-ATGGTCCTAACTTTGCAGAGATA-3'     LPW21565  5'-TGCCAGACATGCCACCACAA-3' 

N     LPW21407  5'-AACGAATCTTAATAACTCATTGTT-3'    LPW21408  5'-CTCTTGTTACTCTTCATTGGCAT-3' 

Pi-BatCoV HKU5 
RdRp  LPW3350   5'-TTTGTCAATTTTGGATAGGACAT-3'     LPW3352   5'-TGATGCATCACAGCARCCATA-3' 
      LPW3351   5'-ATCAGAATAACTGTGAAGTGCTT-3'     LPW3275   5'-GACAATTGGACCAAAAGACGTT-3' 
      LPW3382   5'-CAAATTGTGTGAACTGTACTGAT-3'     LPW3387   5'-ATATATCTCGAAGTAACGATCAA-3' 
      LPW3172   5'-GTCCTGGCAACTTTAATAAAGATT-3'    LPW3130   5'-CTAATATGAGAGATGCAAAGA-3' 
      LPW1507   5'-GGTTGGGACTATCCTAAGTGTGA-3'     LPW1508   5'-CCATCATCAGATAGAATCATCATA-3' 
      LPW3384   5'-CTAAATTTGTGGACAGGTATTAT-3'     LPW3399   5'-CTTCGTATACACGTACCACAA-3' 

S     LPW21416  5'-CTCTTGTCGCAGGGTAAACTT-3'       LPW4284   5'-AAAGACTCTACCTGTGCAGAATA-3' 
      LPW4086   5'-TAACTTATACTGGACTGTACCCAAA-3'   LPW4193   5'-AAGCCATTTGAAGGTTACCATT-3' 
      LPW4192   5'-ACTTTGCTACTTTACCTGTGTAT-3'     LPW4137   5'-AGTAACACCAAATGTGAAATT-3' 
      LPW4285   5'-AATCGCCACTCTAAACTTTACTA-3'     LPW4286   5'-AAGAGGCTGGGTATTCTGGGTT-3' 
      LPW4138   5'-AAGATGAGTCTATTGCTAATCTAT-3'    LPW4139   5'-AGCTTCCATATAGGGGTCATA-3' 
      LPW4287   5'-TGTGCACAATATGTTGCTGGCTA-3'     LPW4288   5'-AAAGAACTACCAGTATAATACCAA-3' 
      LPW4140   5'-AACACTGAGAATCCACCAAA-3'        LPW21417  5'-CACACGCATCATAAGTTCGTT-3' 

N     LPW21361  5'-GAATCTTATTATCTCATTGTT-3'       LPW21362  5'-CTATTACGTTCAATTGGCAAT-3' 







TABLE 2 Detection of Ty-BatCoV HKU4 and Pi-BatCoV HKU5 in bats by RT-PCR 

                                                                  No. (%) of bats positive for CoV 
                                                          No. of  Respiratory samples   Alimentary samples 
                                                          bats    Ty-BatCoV  Pi-BatCoV  Ty-BatCoV  Pi-BatCoV 
    Scientific name           Common name                 tested  HKU4       HKU5       HKU4       HKU5 

Megachiroptera 
  Pteropodidae 
    Cynopterus sphinx         Short-nosed fruit bat       26      0 (0)      0 (0)      0 (0)      0 (0) 
    Rousettus leschenaulti    Leschenault's rousette      73      0 (0)      0 (0)      0 (0)      0 (0) 
Microchiroptera 
  Hipposideridae 
    Hipposideros armiger      Himalayan leaf-nosed bat    198     0 (0)      0 (0)      0 (0)      0 (0) 
    Hipposideros pomona       Pomona leaf-nosed bat       642     0 (0)      0 (0)      0 (0)      0 (0) 
  Rhinolophidae 
    Rhinolophus affinis       Intermediate horseshoe bat  359     0 (0)      0 (0)      0 (0)      0 (0) 
    Rhinolophus pusillus      Least horseshoe bat         89      0 (0)      0 (0)      0 (0)      0 (0) 
    Rhinolophus sinicus       Chinese horseshoe bat       2012    0 (0)      0 (0)      0 (0)      0 (0) 
  Vespertilionidae 
    Hypsugo pulveratus        Chinese pipistrelle         1       0 (0)      0 (0)      0 (0)      0 (0) 
    Miniopterus magnater      Greater bent-winged bat     15      0 (0)      0 (0)      0 (0)      0 (0) 
    Miniopterus pusillus      Lesser bent-winged bat      450     0 (0)      0 (0)      0 (0)      0 (0) 
    Miniopterus schreibersii  Common bent-winged bat      758     0 (0)      0 (0)      0 (0)      0 (0) 
    Myotis chinensis          Chinese myotis              122     0 (0)      0 (0)      0 (0)      0 (0) 
    Myotis horsfieldii        Horsfield's bat             7       0 (0)      0 (0)      0 (0)      0 (0) 
    Myotis muricola           Whiskered myotis            4       0 (0)      0 (0)      0 (0)      0 (0) 
    Myotis ricketti           Rickett's big-footed bat    307     0 (0)      0 (0)      0 (0)      0 (0) 
    Nyctalus noctula          Brown noctule               54      0 (0)      0 (0)      0 (0)      0 (0) 
    Pipistrellus abramus      Japanese pipistrelle        219     0 (0)      0 (0)      0 (0)      55 (25) 
    Pipistrellus tenuis       Least pipistrelle           11      0 (0)      0 (0)      0 (0)      0 (0) 
    Scotophilus kuhlii        Lesser yellow bat           18      0 (0)      0 (0)      0 (0)      0 (0) 
    Tylonycteris pachypus     Lesser bamboo bat           115     0 (0)      0 (0)      29 (29)    0 (0) 
    Tylonycteris robustula    Greater bamboo bat          1       0 (0)      0 (0)      0 (0)      0 (0) 

TABLE 3 Pairwise amino acid identities between the RdRp, S and N genes of Ty-BatCoV HKU4, Pi-BatCoV HKU5 and 
MERS-CoV to those of other betaCoVs 

                           Pairwise amino acid identity (%) 
                           Ty-BatCoV HKU4                   Pi-BatCoV HKU5                   MERS-CoV 
Coronaviruses              RdRp       S          N          RdRp       S          N          RdRp       S          N 

Betacoronavirus lineage A 
  HCoV-OC43                68.8       33.4       33.2       68.7       31.2       34.2       68.3       32         35.3 
  BCoV                     68.7       33.5       33.2       68.6       31.3       34.8       68.2       31.3       35.6 
  PHEV                     68.8       33.2       32.7       68.7       31.2       33.9       68.3       32.5       35.1 
  GiCoV                    68.7       33.9       32.8       68.6       31.6       34.8       68.2       31.4       35.3 
  RCoV                     68.8       32.4       33.7       68.8       31.4       34.3       68.7       32         34.8 
  RbCoV HKU14              68         33.8       33.2       68         30.9       34.9       68         32.2       35.3 
  AntelopeCoV              68.7       33.7       32.8       68.6       31.2       34.8       68.2       31.4       35.3 
  ECoV                     69.1       32.4       34.9       68.7       31.5       35.6       68.3       31.6       35.7 
  MHV                      68.7       32.7       34.1       68.8       31.9       34.7       68.6       31.5       34.3 
  HCoV-HKU1                67.6       32.1       32.8       68.1       30.2       33.3       67.9       31.8       32.3 
Betacoronavirus lineage B 
  SARS-CoV                 71.6       33.6       45.8       71.8       33.5       43.6       71.9       31.6       46.6 
  SARSr-Rh-BatCoV HKU3     71.7       33.6       45.2       71.7       32.8       43.9       71.8       30.6       46.2 
Betacoronavirus lineage C 
  Ty-BatCoV HKU4           99.5-100   97.3-99.6  99.5-100   92-92.5    67.7-68.1  73.5-74    89.6-90    66.8-67.4  71.9-72.3 
  Pi-BatCoV HKU5           92.1-92.4  67.5-68.4  73.7-75.1  99.4-99.7  88.3-97    97.2-98.6  92.1-92.3  63.4-64.5  69.5-70.5 
  MERS-CoV                 89.9       67.3-67.4  71.6-72.1  92.1       64.3       68.8-69.5 
Betacoronavirus lineage D 
  Ro-BatCoV HKU9           69.3       30.8       37.3       68.7       31         36.9       68.4       30.3       37.8 
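
The identities in Table 3 are percentages of matching residues between aligned protein sequences. As a minimal sketch of that arithmetic, the function below computes percent identity for two sequences that are already aligned to the same length. The function name `percent_identity`, the toy sequences, and the choice to skip any column containing a gap are assumptions made for illustration; the gap-handling convention used in the study is not stated in this section.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percent identity between two pre-aligned sequences.

    Assumption: alignment columns with a gap in either sequence are
    excluded from both the numerator and the denominator.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy aligned fragments (hypothetical, not taken from the study).
toy_a = "MKT-LLA"
toy_b = "MKSALLA"
print(round(percent_identity(toy_a, toy_b), 1))
```

When several strains of one virus are compared with another virus, applying the function to every strain pair and reporting the minimum and maximum gives a range in the same form as the table entries.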
TABLE 4 Estimation of non-synonymous and synonymous substitution rates in the 
RdRp, S and N genes of Ty-BatCoV HKU4, Pi-BatCoV HKU5 and MERS-CoV 

        Ty-BatCoV HKU4             Pi-BatCoV HKU5             MERS-CoV 
        (18 strains)               (19 strains)               (2 strains) 
Gene    Ka       Ks       Ka/Ks    Ka       Ks       Ka/Ks    Ka       Ks       Ka/Ks 

RdRp    0.001    0.033    0.03     0.001    0.128    0.0078   0        0.006    0 
S       0.004    0.034    0.118    0.038    0.321    0.118    0.001    0.008    0.125 
N       0.001    0.019    0.053    0.005    0.095    0.053    0.002    0.010    0.2 
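
The Ka/Ks column in Table 4 is the quotient of the two preceding columns. The short sketch below recomputes it from the tabulated Ka and Ks values and attaches the conventional reading of the ratio (below 1 is usually taken as purifying selection, above 1 as positive selection). The names `TABLE4`, `ka_ks_ratio` and `selection_label` are hypothetical. Because the tabulated Ka and Ks are rounded, a recomputed ratio can differ slightly from the printed one.

```python
# Ka and Ks values copied from Table 4 (already rounded in the table).
TABLE4 = {
    "Ty-BatCoV HKU4": {"RdRp": (0.001, 0.033), "S": (0.004, 0.034), "N": (0.001, 0.019)},
    "Pi-BatCoV HKU5": {"RdRp": (0.001, 0.128), "S": (0.038, 0.321), "N": (0.005, 0.095)},
    "MERS-CoV": {"RdRp": (0.0, 0.006), "S": (0.001, 0.008), "N": (0.002, 0.010)},
}


def ka_ks_ratio(ka, ks):
    """Return Ka/Ks, or None when Ks is zero and the ratio is undefined."""
    if ks == 0:
        return None
    return ka / ks


def selection_label(ratio):
    """Conventional interpretation of a Ka/Ks ratio."""
    if ratio is None:
        return "undefined"
    if ratio < 1:
        return "purifying"
    if ratio > 1:
        return "positive"
    return "neutral"


for virus, genes in TABLE4.items():
    for gene, (ka, ks) in genes.items():
        ratio = ka_ks_ratio(ka, ks)
        shown = "n/a" if ratio is None else f"{ratio:.3f}"
        print(f"{virus:15s} {gene:4s} Ka/Ks = {shown} ({selection_label(ratio)})")
```

Every gene in the table gives a ratio well below 1, which agrees with the purifying selection reported for these viruses.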